Eigenoption Discovery through the Deep Successor Representation
Abstract
Options in reinforcement learning allow agents to hierarchically decompose a task into subtasks, having the potential to speed up learning and planning. However, autonomously learning effective sets of options is still a major challenge in the field. In this paper we focus on the recently introduced idea of using representation learning methods to guide the option discovery process. Specifically, we look at eigenoptions, options obtained from representations that encode diffusive information flow in the environment. We extend the existing algorithms for eigenoption discovery to settings with stochastic transitions and in which handcrafted features are not available. We propose an algorithm that discovers eigenoptions while learning non-linear state representations from raw pixels. It exploits recent successes in the deep reinforcement learning literature and the equivalence between proto-value functions and the successor representation. We use traditional tabular domains to provide intuition about our approach and Atari 2600 games to demonstrate its potential.
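The equivalence between proto-value functions and the successor representation mentioned in the abstract can be checked numerically in a small tabular setting. The sketch below is only an illustration, not the paper's algorithm: it assumes a uniform random walk on a ring of 8 states (chosen so that the transition matrix is symmetric), a discount factor of 0.9, and the closed-form successor representation (I - gamma P)^-1. The function names are hypothetical, and the intrinsic reward e[s'] - e[s] follows the definition commonly used for eigenoptions in the tabular case, which this page does not spell out.

```python
import numpy as np

def ring_transition_matrix(n_states):
    """Uniform random walk on a ring: step left or right with probability 0.5."""
    P = np.zeros((n_states, n_states))
    for s in range(n_states):
        P[s, (s - 1) % n_states] = 0.5
        P[s, (s + 1) % n_states] = 0.5
    return P

def successor_representation(P, gamma):
    """Closed-form tabular SR: sum_t gamma^t P^t = (I - gamma P)^-1."""
    return np.linalg.inv(np.eye(P.shape[0]) - gamma * P)

def eigenpurpose_reward(e, s, s_next):
    """Assumed tabular intrinsic reward for a transition s -> s_next."""
    return e[s_next] - e[s]

n_states, gamma = 8, 0.9  # illustrative values, not taken from the paper
P = ring_transition_matrix(n_states)
sr = successor_representation(P, gamma)

# On a regular graph the normalized Laplacian reduces to I - P.
laplacian = np.eye(n_states) - P

# The SR is symmetric here, so eigh applies (eigenvalues in ascending order).
sr_eigvals, sr_eigvecs = np.linalg.eigh(sr)

# If SR v = mu v, then L v = (1 - (1 - 1/mu) / gamma) v.
lap_eigvals = 1.0 - (1.0 - 1.0 / sr_eigvals) / gamma
residual = np.abs(laplacian @ sr_eigvecs - sr_eigvecs * lap_eigvals).max()

# Second-largest SR eigenvector: a smooth, non-constant direction over states.
e = sr_eigvecs[:, -2]
```

Here `residual` measures how far the SR eigenvectors are from being exact Laplacian eigenvectors; in this symmetric setting it should be at the level of floating-point error. An eigenoption would then be a policy that maximizes the accumulated `eigenpurpose_reward` for a chosen eigenvector `e`.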
Similar resources
The Eigenoption-Critic Framework
Eigenoptions (EOs) have recently been introduced as a promising idea for generating a diverse set of options through the graph Laplacian, and have been shown to allow efficient exploration (Machado et al. [2017a]). Despite their promising initial results, a couple of issues in current algorithms limit their application, namely: 1) EO methods require two separate steps (eigenoption discovery and r...
Deep Unsupervised Domain Adaptation for Image Classification via Low Rank Representation Learning
Domain adaptation is a powerful technique when a large amount of labeled data with similar attributes is available in different domains. In real-world applications there is a huge amount of data, but most of it is unlabeled. Domain adaptation is effective in image classification, where obtaining adequately labeled data is expensive and time-consuming. We propose a novel method named DALRRL, which consists of deep ...
On the Sequentiality of the Successor Function
Let U be a strictly increasing sequence of integers. By a greedy algorithm, every nonnegative integer has a greedy U-representation. The successor function maps the greedy U-representation of N onto the greedy U-representation of N+1. We characterize the sequences U such that the successor function associated to U is a left, resp. a right sequential function. We also show that the odometer a...
Deep Successor Reinforcement Learning
Learning robust value functions given raw observations and rewards is now possible with model-free and model-based deep reinforcement learning algorithms. There is a third alternative, called Successor Representations (SR), which decomposes the value function into two components – a reward predictor and a successor map. The successor map represents the expected future state occupancy from any g...
The Successor Representation and Temporal Context
The successor representation was introduced into reinforcement learning by Dayan (1993) as a means of facilitating generalization between states with similar successors. Although reinforcement learning in general has been used extensively as a model of psychological and neural processes, the psychological validity of the successor representation has yet to be explored. An interesting possibilit...
Journal: CoRR
Volume: abs/1710.11089
Publication date: 2017